 plug-in estimator


Variance-Reduced Manifold Sampling via Polynomial-Maximization Density Estimation

arXiv.org Machine Learning

Uniform sampling on implicitly defined manifolds is a core primitive in motion planning, constrained simulation, and probabilistic machine learning. MASEM addresses this problem by entropy-maximizing resampling, but its resampling weights depend on a local k-nearest-neighbour density estimate whose errors can be amplified by aggressive resampling temperatures. We ask whether a polynomial-maximization moment estimator can replace the plug-in density rule without changing the surrounding MASEM architecture. The proposed PMM-MASEM module computes shell spacings from nested k-nearest-neighbour radii, estimates their standardized cumulants, and uses a gated PMM2/PMM3 estimator only when the spacing distribution departs from the flat Exp(1) regime; otherwise it falls back to the plug-in/MLE rule. This fallback is essential: on a flat homogeneous manifold the plug-in estimator is already the MLE, so PMM should not outperform it. A local Known-DGP Monte Carlo experiment confirms this gate: the selector returns MLE on flat Exp(1) spacings and reduces density MSE by 22--36% on asymmetric gamma and boundary-spacing regimes. The evidence is not uniformly positive: PMM3 worsens a platykurtic uniform spacing law, and a lightweight resampling-proxy experiment improves seven-lobes coverage but degrades the sine and swiss-roll proxies. The current evidence therefore supports an applicability-boundary result rather than a general MASEM improvement claim.
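The plug-in baseline mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard k-nearest-neighbour density estimate k / (n V_d r_k^d) and uses the fact that, for locally homogeneous points, the scaled volume differences between nested k-NN balls behave like i.i.d. exponential spacings, so the exponential-rate MLE coincides with the plug-in rule. The function names, the use of skewness and excess kurtosis as the "standardized cumulants", and the flat test data are all assumptions for illustration; the PMM2/PMM3 estimator and the paper's gate thresholds are not reproduced here.

```python
import math
import numpy as np

def unit_ball_volume(d):
    """Volume of the unit ball in R^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def knn_radii(points, query, k):
    """Sorted distances from `query` to its k nearest neighbours in `points`."""
    dists = np.linalg.norm(points - query, axis=1)
    return np.sort(dists)[:k]

def shell_spacings(radii, n, d):
    """n times the volume of each shell between consecutive nested k-NN balls.

    Under a locally constant density p these are approximately i.i.d.
    exponential with rate p (i.e. Exp(1) after multiplying by p).
    """
    vols = unit_ball_volume(d) * radii ** d
    return n * np.diff(np.concatenate(([0.0], vols)))

def plugin_density(radii, n, d):
    """Classical k-NN plug-in density estimate k / (n * V_d * r_k^d)."""
    k = len(radii)
    return k / (n * unit_ball_volume(d) * radii[-1] ** d)

def mle_density(spacings):
    """Exponential-rate MLE from the spacings: k / sum(spacings)."""
    return len(spacings) / spacings.sum()

def standardized_cumulants(spacings):
    """Sample skewness and excess kurtosis (hypothetical inputs to a gate)."""
    z = (spacings - spacings.mean()) / spacings.std()
    return float(np.mean(z ** 3)), float(np.mean(z ** 4) - 3.0)

# Illustrative flat case: uniform points on the unit square (true density 1).
rng = np.random.default_rng(0)
n, d, k = 20000, 2, 200
points = rng.uniform(size=(n, d))
query = np.array([0.5, 0.5])

radii = knn_radii(points, query, k)
spacings = shell_spacings(radii, n, d)
p_plugin = plugin_density(radii, n, d)
p_mle = mle_density(spacings)
skew, excess_kurtosis = standardized_cumulants(spacings)
```

Because the spacings telescope to n V_d r_k^d, `p_mle` and `p_plugin` agree up to floating-point error, which is one way to see why no gain over the plug-in rule should be expected in the flat regime.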


Low-rank Optimal Transport: Approximation, Statistics and Debiasing

Neural Information Processing Systems

The matching principles behind optimal transport (OT) play an increasingly important role in machine learning, a trend which can be observed when OT is used to disambiguate datasets in applications (e.g.


Generating DDPM-based Samples from Tilted Distributions

arXiv.org Machine Learning

Given $n$ independent samples from a $d$-dimensional probability distribution, our aim is to generate diffusion-based samples from a distribution obtained by tilting the original, where the degree of tilt is parametrized by $θ\in \mathbb{R}^d$. We define a plug-in estimator and show that it is minimax-optimal. We develop Wasserstein bounds between the distribution of the plug-in estimator and the true distribution as a function of $n$ and $θ$, illustrating regimes where the output and the desired true distribution are close. Further, under some assumptions, we prove the TV-accuracy of running Diffusion on these tilted samples. Our theoretical results are supported by extensive simulations. Applications of our work include finance, weather and climate modelling, and many other domains, where the aim may be to generate samples from a tilted distribution that satisfies practically motivated moment constraints.
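As a rough illustration only, the sketch below assumes that "tilting" means exponential tilting, p_θ(x) ∝ exp(θ·x) p(x), and that the plug-in estimator replaces p with the empirical measure, which gives self-normalized weights on the observed samples. Neither assumption is confirmed by the abstract, and the function names are hypothetical. The diffusion step is not shown.

```python
import numpy as np

def tilted_weights(samples, theta):
    """Self-normalized weights proportional to exp(theta . x_i).

    Assumes exponential tilting; the maximum logit is subtracted
    before exponentiating for numerical stability.
    """
    logits = samples @ theta
    logits = logits - logits.max()
    w = np.exp(logits)
    return w / w.sum()

def resample_tilted(samples, theta, size, rng):
    """Draw `size` points from the weighted empirical distribution."""
    w = tilted_weights(samples, theta)
    idx = rng.choice(len(samples), size=size, p=w)
    return samples[idx]

# Illustrative check: tilting N(0, I) by theta yields N(theta, I),
# so the weighted sample mean should be close to theta.
rng = np.random.default_rng(0)
theta = np.array([0.5, -0.3])
samples = rng.standard_normal(size=(50000, 2))

weights = tilted_weights(samples, theta)
tilted_mean = weights @ samples
tilted_draws = resample_tilted(samples, theta, size=1000, rng=rng)
```

The weights degenerate as the norm of θ grows relative to the sample size, which is consistent with the abstract's point that closeness to the true tilted distribution depends on both n and θ.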



Supplement to "Rates of Estimation of Optimal Transport Maps using Plug-in Estimators via Barycentric Projections"

Neural Information Processing Systems

For the moment, it is worth noting that such sets of functions (e.g., Haar wavelets, Daubechies wavelets) are readily ... We are now in a position to present the main theorem of this subsection. To avoid repetition, we defer further discussions on the rates observed in Theorem A.1 to Remark 2.7 where a holistic ... In fact, by Proposition 1.1, there exists an optimal transport map ... Based on (B.2), the natural plug-in estimator of ρ ... Suppose that the same assumptions from Theorem 2.2 hold. B.2 Nonparametric independence testing: Optimal transport based Hilbert-Schmidt independence criterion. Proposition B.2 shows that the test based on ... Further, when the sampling distribution is fixed, Proposition B.2 shows that ... In the following result (see Appendix C.2 for a proof), we show that if ... This section is devoted to proving our main results and is organized as follows: In Appendix C.1, we ... Further by Lemma D.2, we also have: ... Note that (C.10) immediately yields the following conclusions: ... By (1.5) and some simple algebra, the following holds: ... Combining the above display with (C.9), we further have: ... Combining the above observation with Theorem 2.1, we have: ... For the next part, to simplify notation, let us begin with some notation. By using the exponential Markov's inequality coupled with the standard union ... Now by using [7, Theorem 2.10], we have ... We are now in a position to complete the proof of Theorem 2.2 using steps I-III. Therefore, it is now enough to bound the right hand side of (C.17).




Dimension-Free Empirical Entropy Estimation

Neural Information Processing Systems

We seek an entropy estimator for discrete distributions with fully empirical accuracy bounds. As stated, this goal is infeasible without some prior assumptions on the distribution. We discover that a certain information moment assumption renders the problem feasible. We argue that the moment assumption is natural and, in some sense, minimalistic -- weaker than finite support or tail decay conditions. Under the moment assumption, we provide the first finite-sample entropy estimates for infinite alphabets, nearly recovering the known minimax rates. Moreover, we demonstrate that our empirical bounds are significantly sharper than the state-of-the-art bounds, for various natural distributions and non-trivial sample regimes. Along the way, we give a dimension-free analogue of the Cover-Thomas result on entropy continuity (with respect to total variation distance) for finite alphabets, which may be of independent interest.